I. Project Overview

Branding is an important component of a company’s value. It develops from advertising, product differentiation, customer loyalty, and many other factors. As social media grows, consumers’ perception of a brand has become an increasingly important factor in valuing the brand, and it helps management monitor performance and apply marketing strategy. This project analyzes consumers’ perception of the Walt Disney Company using Twitter.

II. Company Overview

The Walt Disney Company is an American diversified multinational mass media and entertainment conglomerate, and is the world’s second largest media conglomerate in terms of revenue. Some of the big events for the company this year are the acquisition of part of 21st Century Fox, the announcement of the Chinese actress Yifei Liu as a cast member of the Mulan movie, and the release of the new movie Coco. The project captures and analyzes 15,000 global tweets in English containing the keyword “Disney” as of December 14, 2017.

III. Data Collection and Cleaning

Data Collection

The Twitter data was collected with the twitteR package, which searches for tweets and user information through a Twitter app.

The data collection process starts by searching for any tweets with the keyword “Disney” posted since November 11, 2017. The project tries to capture the time trend of consumers’ perception; however, due to the search limits imposed by Twitter, the tweets are concentrated on December 14, 2017.

After obtaining the tweet data, the user names were used to retrieve user information, especially the registered location of each account and its coordinates.

Data Cleaning

User-generated content can complicate the data cleaning process. In the user location vector, some entries are written in non-English characters, so to filter out the invalid data, I used the “tools” package to remove users whose locations contain non-ASCII characters.
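The filtering step can be sketched as follows. This is a minimal base R illustration, not the report's actual code: it uses utf8ToInt rather than the “tools” package, and the locations vector and the is_ascii helper are hypothetical names introduced only for this example.

```r
# Hypothetical sample of user-supplied location strings (not the project data).
locations <- c("Orlando, FL", "Paris, France", "\u6771\u4eac", "M\u00fcnchen", "London")

# TRUE when every code point of a string lies in the 7-bit ASCII range.
is_ascii <- function(x) {
  vapply(x, function(s) all(utf8ToInt(s) < 128L), logical(1), USE.NAMES = FALSE)
}

# Keep only the entries made up entirely of ASCII characters.
ascii_only <- locations[is_ascii(locations)]
print(ascii_only)
```

Note that a filter of this kind also drops legitimate place names that contain accented letters, such as the German spelling of Munich in the sample above, so it trades some coverage for simplicity.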

Data Summary

The data collection started with 150,000 entries of raw tweet data, and ******

# Data Summary
# library(dplyr)
# tweets.df %>%
#   summarise(
#     Num_Tweets = n(),                       # total number of tweets
#     Avg_Num_Retweet = mean(retweetCount),   # mean retweets per tweet
#     Num_Locations = sum(!is.na(lon))        # tweets with a known longitude
#   ) -> s
# s

The following graph maps the distribution of the gathered tweets. As shown in the map, most of the data points cluster in the US and Europe. This is most likely because only English-language tweets were collected, and the use of English varies across different parts of the world. The analysis therefore focuses on consumer perception of the Disney brand in these two locations.

Text Analysis

Text analysis examines the occurrence of words in the collected tweets in order to assess consumers’ attitude towards the Disney brand.

Word Frequencies and Word Cloud

The text analysis starts by separating each tweet into single words, then analyzes how frequently each word occurs.

For the word cloud, I only consider the text of the tweets people post. To extract all the words in the tweets, I used the function unnest_tokens. Because many of these words carry no meaning for the analysis, I used stop_words to eliminate them. In addition, I added custom stop words specific to this project, such as “disney”, which is naturally frequent for this topic, along with other uninformative words that appeared in the list. According to the word cloud built from the filtered data, 21st Century Fox appears to be the hottest topic, which relates to the acquisition. Other high-frequency words include “star”, “christmas”, “wars”, “beautiful”, and so on.
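The tokenizing and counting steps described above can be sketched as follows. This is only an illustration in base R, swapping strsplit and table in for unnest_tokens and stop_words so that it runs without extra packages. The sample tweets and the custom_stop_words vector are made up for the example and are not the project's data.

```r
# Hypothetical tweets and stop-word list, for illustration only.
tweets <- c(
  "Disney buys Fox and Star Wars fans react",
  "The new Star Wars trailer is beautiful",
  "Christmas at Disney is beautiful"
)
custom_stop_words <- c("disney", "the", "is", "at", "and", "new")

# Lower-case, split on runs of non-letters, and flatten into one word vector.
words <- unlist(strsplit(tolower(tweets), "[^a-z]+"))
words <- words[nzchar(words)]

# Drop the stop words, then count each remaining word and sort by frequency.
words <- words[!words %in% custom_stop_words]
word_freq <- sort(table(words), decreasing = TRUE)
print(head(word_freq, 5))
```

The resulting frequency table is the kind of input a word cloud function would take: one entry per word with its count.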

Sentiment Analysis

Sentiment analysis is used to estimate the overall positivity of consumers’ attitude towards Disney. The sentiment score is calculated as the count of positive keywords minus the count of negative keywords, as indicated in the word list “bing”.

## # A tibble: 2,476 x 2
##          word score
##         <chr> <int>
##  1    abandon    -2
##  2  abandoned    -2
##  3   abandons    -2
##  4   abducted    -2
##  5  abduction    -2
##  6 abductions    -2
##  7      abhor    -3
##  8   abhorred    -3
##  9  abhorrent    -3
## 10     abhors    -3
## # ... with 2,466 more rows
## # A tibble: 6,788 x 2
##           word sentiment
##          <chr>     <chr>
##  1     2-faced  negative
##  2     2-faces  negative
##  3          a+  positive
##  4    abnormal  negative
##  5     abolish  negative
##  6  abominable  negative
##  7  abominably  negative
##  8   abominate  negative
##  9 abomination  negative
## 10       abort  negative
## # ... with 6,778 more rows
## # A tibble: 13,901 x 2
##           word sentiment
##          <chr>     <chr>
##  1      abacus     trust
##  2     abandon      fear
##  3     abandon  negative
##  4     abandon   sadness
##  5   abandoned     anger
##  6   abandoned      fear
##  7   abandoned  negative
##  8   abandoned   sadness
##  9 abandonment     anger
## 10 abandonment      fear
## # ... with 13,891 more rows
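The scoring rule stated above (positive keyword count minus negative keyword count) can be sketched as follows. This is a minimal base R illustration under stated assumptions: the lexicon here is a tiny hand-made table in the same word/sentiment layout as “bing”, and the sentiment_score helper and the sample tweets are hypothetical.

```r
# Tiny hypothetical lexicon in the same word/sentiment layout as "bing".
lexicon <- data.frame(
  word = c("beautiful", "love", "magical", "abominable", "boring", "hate"),
  sentiment = c("positive", "positive", "positive", "negative", "negative", "negative"),
  stringsAsFactors = FALSE
)

# Score = number of positive lexicon words minus number of negative ones.
sentiment_score <- function(text, lexicon) {
  words <- unlist(strsplit(tolower(text), "[^a-z]+"))
  matched <- lexicon$sentiment[match(words, lexicon$word)]
  sum(matched == "positive", na.rm = TRUE) - sum(matched == "negative", na.rm = TRUE)
}

# Hypothetical tweets, for illustration only.
tweets <- c(
  "I love the beautiful and magical castle",
  "That sequel was boring, I hate it",
  "Going to the park tomorrow"
)
scores <- vapply(tweets, sentiment_score, integer(1), lexicon = lexicon, USE.NAMES = FALSE)
print(scores)
```

A tweet with no lexicon words scores zero, so a score of zero can mean either a neutral tweet or one whose words are simply missing from the word list.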

After calculating sentiment scores for each Twitter user, the following graphs were made to analyze the overall sentiment of consumers in the US and Europe.

Each Twitter user is mapped so that the size of the point represents the sentiment score. As the map shows, US consumers outnumber European consumers and generally have higher sentiment scores, which suggests that Disney is more recognized in the US than in Europe.

To analyze the sentiment score across the US, the following graphs are drawn to compare sentiment scores across different states.
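A state-level comparison of this kind needs the per-user scores to be summarized by state first. The sketch below shows one possible way to do that in base R. It assumes the mean score per state is the quantity of interest, which the report does not specify, and the user_scores data frame is made up for the example.

```r
# Hypothetical per-user scores with a state label (not the project data).
user_scores <- data.frame(
  state = c("CA", "CA", "FL", "FL", "FL", "NY"),
  score = c(2, 0, 3, 1, -1, -2),
  stringsAsFactors = FALSE
)

# Mean sentiment score per state.
by_state <- aggregate(score ~ state, data = user_scores, FUN = mean)

# Number of users behind each state's mean.
by_state$n_users <- as.integer(table(user_scores$state)[by_state$state])

# Order states from most to least positive.
by_state <- by_state[order(-by_state$score), ]
print(by_state)
```

Keeping the user count next to the mean makes it easier to spot states whose score rests on only a handful of accounts.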

Also, to customize the graphs and accommodate more flexible queries, Shiny apps were built from the sentiment scores at the following links: https://yutingma.shinyapps.io/us_sentiment/ https://yutingma.shinyapps.io/sentiment_by_state/ (For the Sentiment by State interactive chart, the maps for the 50 states take some time to load, so an error message may appear at first.)